model data
RynnVLA-002: A Unified Vision-Language-Action and World Model
Cen, Jun, Huang, Siteng, Yuan, Yuqian, Li, Kehan, Yuan, Hangjie, Yu, Chaohui, Jiang, Yuming, Guo, Jiayan, Li, Xin, Luo, Hao, Wang, Fan, Zhao, Deli, Chen, Hao
We introduce RynnVLA-002, a unified Vision-Language-Action (VLA) and world model. The world model leverages action and visual inputs to predict future image states, learning the underlying physics of the environment to refine action generation. Conversely, the VLA model produces subsequent actions from image observations, enhancing visual understanding and supporting the world model's image generation. The unified framework of RynnVLA-002 enables joint learning of environmental dynamics and action planning. Our experiments show that RynnVLA-002 surpasses individual VLA and world models, demonstrating their mutual enhancement. We evaluate RynnVLA-002 on both simulation and real-world robot tasks. RynnVLA-002 achieves a 97.4% success rate on the LIBERO simulation benchmark without pretraining, while in real-world LeRobot experiments its integrated world model boosts the overall success rate by 50%.
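The abstract describes two coupled prediction directions sharing one visual backbone: the world model maps current visual features plus an action to future visual features, while the VLA head maps visual features to an action. A minimal numpy sketch of this coupling (the random linear maps, dimensions, and function names are illustrative assumptions, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared visual feature extractor: a random linear map stands in for a
# vision backbone; all dimensions are illustrative.
W_feat = rng.normal(size=(16, 8))

def encode(image):
    # image: (16,) pixel vector -> (8,) shared features
    return np.tanh(image @ W_feat)

# Action head (the "VLA" direction): features -> action.
W_act = rng.normal(size=(8, 4))
def predict_action(image):
    return encode(image) @ W_act

# World-model head: features + action -> predicted next-frame features.
W_dyn = rng.normal(size=(8 + 4, 8))
def predict_next_features(image, action):
    return np.concatenate([encode(image), action]) @ W_dyn

image = rng.normal(size=16)
action = predict_action(image)                      # VLA direction
next_feat = predict_next_features(image, action)    # world-model direction
```

Because both heads read the same `encode` features, gradients from the world-model objective and the action objective would flow into one backbone, which is the joint-learning structure the abstract credits for the mutual enhancement.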
WorldVLA: Towards Autoregressive Action World Model
Cen, Jun, Yu, Chaohui, Yuan, Hangjie, Jiang, Yuming, Huang, Siteng, Guo, Jiayan, Li, Xin, Song, Yibing, Luo, Hao, Wang, Fan, Zhao, Deli, Chen, Hao
We present WorldVLA, an autoregressive action world model that unifies action and image understanding and generation. WorldVLA integrates a Vision-Language-Action (VLA) model and a world model in a single framework. The world model predicts future images by leveraging both action and image understanding, with the purpose of learning the underlying physics of the environment to improve action generation. Meanwhile, the action model generates subsequent actions based on image observations, aiding visual understanding and, in turn, the visual generation of the world model. We demonstrate that WorldVLA outperforms standalone action and world models, highlighting the mutual enhancement between the two. In addition, we find that the performance of the action model deteriorates when generating sequences of actions autoregressively. This phenomenon can be attributed to the model's limited generalization capability for action prediction, which propagates errors from earlier actions to subsequent ones. To address this issue, we propose an attention mask strategy that selectively masks prior actions during the generation of the current action, which yields a significant performance improvement in the action chunk generation task.
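The attention mask strategy can be made concrete: within an action chunk, each action token is allowed to attend to all observation tokens and to itself, but not to previously generated action tokens, so an early action error cannot propagate through attention. A small numpy sketch (the token layout and function name are assumptions for illustration, not the paper's implementation):

```python
import numpy as np

def action_chunk_mask(n_obs, n_act):
    """Boolean attention mask (True = may attend) for a token sequence
    laid out as [obs_0 .. obs_{n_obs-1}, act_0 .. act_{n_act-1}]."""
    n = n_obs + n_act
    mask = np.tril(np.ones((n, n), dtype=bool))  # standard causal mask
    # Selectively mask prior actions: an action token keeps access to all
    # observation tokens and itself, but not to earlier action tokens.
    for i in range(n_obs, n):
        for j in range(n_obs, i):
            mask[i, j] = False
    return mask

m = action_chunk_mask(n_obs=2, n_act=3)
```

With `n_obs=2, n_act=3`, row 3 (the second action) attends to tokens 0, 1, and 3 only: the observations and itself, never action token 2.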
DOMAIN: MilDly COnservative Model-BAsed OfflINe Reinforcement Learning
Liu, Xiao-Yin, Zhou, Xiao-Hu, Xie, Xiao-Liang, Liu, Shi-Qi, Feng, Zhen-Qiu, Li, Hao, Gui, Mei-Jiang, Xiang, Tian-Yu, Huang, De-Xing, Hou, Zeng-Guang
Model-based reinforcement learning (RL), which learns an environment model from an offline dataset and generates additional out-of-distribution model data, has become an effective approach to the distribution-shift problem in offline RL. Because of the gap between the learned and actual environments, conservatism should be incorporated into the algorithm to balance accurate offline data against imprecise model data. The conservatism of current algorithms mostly relies on model uncertainty estimation. However, uncertainty estimation is unreliable and leads to poor performance in certain scenarios, and previous methods ignore differences among the model data, which introduces excessive conservatism. This paper therefore proposes a milDly cOnservative Model-bAsed offlINe RL algorithm (DOMAIN) that addresses these issues without estimating model uncertainty. DOMAIN introduces an adaptive sampling distribution over model samples, which adaptively adjusts the penalty on model data. We theoretically demonstrate that the Q value learned by DOMAIN outside the offline data region is a lower bound of the true Q value, that DOMAIN is less conservative than previous model-based offline RL algorithms, and that it carries a safe policy improvement guarantee. Extensive experiments show that DOMAIN outperforms prior RL algorithms on the D4RL benchmark and achieves better performance than other RL algorithms on tasks that require generalization.
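The core idea of a data-dependent penalty on model-generated transitions, rather than a uniform uncertainty-based one, can be sketched loosely: the adaptive sampling distribution assigns each model sample a weight, and samples with larger weight receive a larger penalty on their Bellman targets. The function below is an illustrative toy reading of that idea, not DOMAIN's actual formulation; all names and constants are assumptions.

```python
import numpy as np

def penalized_targets(rewards, q_next, sample_weights, gamma=0.99, beta=5.0):
    """Bellman targets for model-generated transitions with an adaptive
    per-sample penalty. `sample_weights` stands in for an adaptive
    sampling distribution: samples it emphasizes more are penalized
    more, lowering their Q targets (toy sketch only)."""
    penalty = beta * sample_weights / sample_weights.sum()
    return rewards - penalty + gamma * q_next

rewards = np.array([1.0, 1.0, 1.0])
q_next = np.array([0.0, 0.0, 0.0])
weights = np.array([0.1, 0.3, 0.6])   # illustrative sampling distribution
targets = penalized_targets(rewards, q_next, weights)
# identical transitions end up with different targets, so the
# conservatism varies across model data instead of being uniform
```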
- Asia > Macao (0.14)
- Asia > China > Beijing > Beijing (0.04)
- North America > United States (0.04)
High-precision interpolation of stellar atmospheres with a deep neural network using a 1D convolutional auto encoder for feature extraction
Plaza, C. Westendorp, Ramos, A. Asensio, Prieto, C. Allende
Given the widespread availability of grids of models for stellar atmospheres, it is necessary to recover intermediate atmospheric models by means of accurate techniques that go beyond simple linear interpolation and capture the intricacies of the data. Our goal is to establish a reliable, precise, lightweight, and fast method for recovering stellar model atmospheres, that is, the stratification of mass column, temperature, gas pressure, and electron density with optical depth, given any combination of the defining atmospheric parameters: metallicity, effective temperature, and surface gravity, as well as the abundances of other key chemical elements. We employed a fully connected deep neural network which in turn uses a 1D convolutional auto-encoder to extract the nonlinearities of a grid built from the ATLAS9 and MARCS model atmospheres. This new method, which we call iNNterpol, effectively accounts for the nonlinear relationships in the data, in contrast to traditional machine-learning methods, such as the light gradient-boosting machine (LightGBM), that are often used for their speed in well-known competitions with reduced datasets. We show that a convolutional auto-encoder achieves higher precision than principal component analysis as a feature extractor. We believe iNNterpol constitutes a useful tool for generating fast and precise stellar model atmospheres, mitigating convergence issues, as well as a framework for future developments. The code and data for both training and direct interpolation are available online at https://github.com/cwestend/iNNterpol for full reproducibility and as a practical starting point for other continuous 1D data in this field and elsewhere.
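As a point of reference, the "simple linear interpolation" baseline that iNNterpol aims to improve upon fits in a few lines: interpolate a stratification layer by layer between two neighboring grid models. The stratifications below are made-up toy values for illustration, not ATLAS9 or MARCS data.

```python
import numpy as np

# Two toy model atmospheres from a grid: temperature vs. log optical
# depth, at effective temperatures 5000 K and 6000 K (invented values).
tau = np.linspace(-4, 2, 7)           # log optical depth sampling
T_5000 = 4000 + 600 * (tau + 4)       # toy temperature stratification
T_6000 = 4800 + 700 * (tau + 4)

# Plain linear interpolation to Teff = 5500 K, layer by layer.
frac = (5500 - 5000) / (6000 - 5000)
T_5500_linear = (1 - frac) * T_5000 + frac * T_6000
```

This treats every depth point independently and linearly in each parameter; the paper's argument is that the true dependence of the stratifications on the atmospheric parameters is nonlinear, which is what the auto-encoder features let the network capture.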
- Europe > Spain > Canary Islands > Tenerife (0.04)
- Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
- North America > United States > Nevada > Clark County > Las Vegas (0.04)
PatrickStar: Parallel Training of Pre-trained Models via Chunk-based Memory Management
Fang, Jiarui, Zhu, Zilin, Li, Shenggui, Su, Hui, Yu, Yang, Zhou, Jie, You, Yang
The pre-trained model (PTM) is revolutionizing Artificial Intelligence (AI) technology. However, the hardware requirements of PTM training are prohibitively high, making it accessible to only a small proportion of people. We therefore propose the PatrickStar system to lower the hardware requirements of PTMs and make them accessible to everyone. PatrickStar uses the CPU-GPU heterogeneous memory space to store the model data. Unlike existing works, we organize the model data in memory chunks and dynamically distribute them across the heterogeneous memory. Guided by runtime memory statistics collected in a warm-up iteration, chunks are orchestrated efficiently in heterogeneous memory, yielding lower CPU-GPU data transmission volume and higher bandwidth utilization. In symbiosis with the Zero Redundancy Optimizer, PatrickStar scales to multiple GPUs on multiple nodes using data parallelism. The system can train bigger models with larger batch sizes, which existing works cannot accomplish. Experimental results show that PatrickStar extends model scales to 2.27 and 2.5 times those of DeepSpeed, and consistently exhibits significantly higher execution speed. PatrickStar also successfully runs the 175B-parameter GPT-3 training task on a 32-GPU cluster. Our code is publicly available at https://github.com/Tencent/PatrickStar.
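The chunk-based placement idea can be sketched as a small manager: model data lives in fixed-size chunks, and before an operator runs, the chunks it needs are moved to GPU, evicting least-recently-used chunks when a GPU budget (in PatrickStar, informed by warm-up statistics) would be exceeded. This is a hypothetical toy sketch, not PatrickStar's actual implementation; all class and method names are invented.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    chunk_id: int
    nbytes: int
    device: str = "cpu"          # chunks start in CPU memory

class ChunkManager:
    """Toy chunk placement with an LRU eviction policy and a fixed
    GPU memory budget (illustrative sketch only)."""
    def __init__(self, gpu_budget):
        self.gpu_budget = gpu_budget
        self.chunks = {}
        self.gpu_lru = []        # chunk ids on GPU, oldest first

    def register(self, chunk):
        self.chunks[chunk.chunk_id] = chunk

    def fetch(self, chunk_id):
        """Ensure a chunk is on GPU before its operator runs."""
        chunk = self.chunks[chunk_id]
        if chunk.device == "gpu":
            self.gpu_lru.remove(chunk_id)    # refresh LRU position
            self.gpu_lru.append(chunk_id)
            return chunk
        # Evict oldest chunks until this one fits within the budget.
        while self._gpu_used() + chunk.nbytes > self.gpu_budget:
            victim = self.chunks[self.gpu_lru.pop(0)]
            victim.device = "cpu"
        chunk.device = "gpu"
        self.gpu_lru.append(chunk_id)
        return chunk

    def _gpu_used(self):
        return sum(self.chunks[c].nbytes for c in self.gpu_lru)

mgr = ChunkManager(gpu_budget=100)
mgr.register(Chunk(0, 60))
mgr.register(Chunk(1, 60))
mgr.fetch(0)
mgr.fetch(1)   # exceeds the 100-byte budget, so chunk 0 is evicted to CPU
```

Managing whole chunks rather than individual tensors is what amortizes transfer cost: one large contiguous copy uses PCIe bandwidth far better than many small ones.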
Airborne LiDAR-assisted deep learning methodology for riparian land cover classification using aerial photographs and its application for flood modelling
In response to challenges in land cover classification (LCC), many researchers have recently experimented with classification methods based on artificial intelligence techniques. For LCC mapping of the vegetated Asahi River in Japan, the current study uses the deep learning (DL)-based DeepLabV3 module for image segmentation of aerial photographs. We modified the existing model by concatenating data on its resultant output port to access the airborne laser bathymetry (ALB) dataset, including voxel-based laser points and vegetation height. Findings revealed that the modified approach greatly improved the accuracy of LCC compared to our earlier unsupervised ALB-based method, with 25% and 35% improvements, respectively, in overall accuracy and the macro F1-score for the November 2017 dataset (no-leaf condition). Finally, by estimating flow-resistance parameters in flood modelling using LCC mapping-derived data, we conclude that the upgraded DL methodology produces a better fit between numerically analyzed and observed peak water levels.
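The data-fusion step rests on concatenating ALB-derived rasters with the imagery stream. The paper attaches the ALB data at the network's output port; the sketch below shows only the generic channel-concatenation operation, with invented array names and shapes, as a minimal illustration of fusing LiDAR-derived layers with an image tile.

```python
import numpy as np

# Aerial photograph tile: H x W x 3 (RGB), plus ALB-derived rasters
# gridded to the same resolution (all values here are placeholders).
rgb = np.zeros((64, 64, 3), dtype=np.float32)
veg_height = np.ones((64, 64, 1), dtype=np.float32)          # vegetation height
point_density = np.full((64, 64, 1), 0.5, dtype=np.float32)  # voxel point density

# Stack the LiDAR-derived channels onto the image so the classifier
# sees spectral and structural information per pixel.
x = np.concatenate([rgb, veg_height, point_density], axis=-1)
```

Vegetation height is exactly the cue that RGB alone cannot provide in a no-leaf condition, which is consistent with the large accuracy gain the study reports for the November dataset.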
Machine learning: Apache Flink ML 2.0 opens for Python - Market Research Telecast
The team behind Apache Flink has released Apache Flink ML in version 2.0, a companion library for machine learning built on the stream-processing framework. Apache Flink ML provides both APIs and infrastructure for building stream-batch unified ML algorithms, designed to be easy to use and to deliver near-real-time latency. The release is intended to extend Apache Flink to new machine learning use cases, in particular real-time ML scenarios.
Global Big Data Conference
Aquarium, a startup from two former Cruise employees, wants to help companies refine their machine learning model data more easily and move the models into production faster. Today the company announced a $2.6 million seed led by Sequoia with participation from Y Combinator and a number of angel investors, including Cruise co-founders Kyle Vogt and Dan Kan. When the two co-founders, CEO Peter Gao and head of engineering Quinn Johnson, were at Cruise, they learned that finding areas of weakness in the model data was often the problem that prevented a model from getting into production. Aquarium aims to solve this issue. "Aquarium is a machine learning data management system that helps people improve model performance by improving the data that it's trained on, which is usually the most important part of making the model work in production," Gao told me.
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.40)
How Do You Test AI Systems?
Everyone who has ever worked on an application development project knows that you don't simply put code and content into production, to your customers, employees, or stakeholders, without first testing it to make sure it isn't broken or dead on delivery. Quality Assurance (QA) is such a core part of any technology or business delivery that it's one of the essential components of any development methodology. And the best way to do all this is in an agile fashion, in small, iterative chunks, so you can respond to the continuously evolving and changing needs of the customer. Surely AI projects are no different. There are iterative design, development, testing, and delivery phases, as we've discussed in our previous content on AI methodologies.
When Is it Okay to Use Data for AI?
Developing AI requires a lot of data and, in many cases, this data comes from third parties. But organizations willing to share data for computational uses have not had easy-to-use licenses for distributing data. Many common licenses, such as the Creative Commons licenses, were developed without consideration for how data could be used for machine learning. The absence of model data-sharing agreements has deterred many data owners who would otherwise be eager to share their data, hindering AI development. To address this problem, Microsoft has published three model data use agreements designed to specify whether and how data can be used for AI development.